5,132 research outputs found
Nonparametric Bayesian Modeling for Automated Database Schema Matching
The problem of merging databases arises in many government and commercial
applications. Schema matching, a common first step, identifies equivalent
fields between databases. We introduce a schema matching framework that builds
nonparametric Bayesian models for each field and compares them by computing the
probability that a single model could have generated both fields. Our
experiments show that our method is more accurate and faster than the existing
instance-based matching algorithms in part because of the use of nonparametric
Bayesian models
Nonparametric Bayesian methods for one-dimensional diffusion models
In this paper we review recently developed methods for nonparametric Bayesian
inference for one-dimensional diffusion models. We discuss different possible
prior distributions, computational issues, and asymptotic results
Nonparametric Bayesian Inference on Bivariate Extremes
The tail of a bivariate distribution function in the domain of attraction of
a bivariate extreme-value distribution may be approximated by the one of its
extreme-value attractor. The extreme-value attractor has margins that belong to
a three-parameter family and a dependence structure which is characterised by a
spectral measure, that is a probability measure on the unit interval with mean
equal to one half. As an alternative to parametric modelling of the spectral
measure, we propose an infinite-dimensional model which is at the same time
manageable and still dense within the class of spectral measures. Inference is
done in a Bayesian framework, using the censored-likelihood approach. In
particular, we construct a prior distribution on the class of spectral measures
and develop a trans-dimensional Markov chain Monte Carlo algorithm for
numerical computations. The method provides a bivariate predictive density
which can be used for predicting the extreme outcomes of the bivariate
distribution. In a practical perspective, this is useful for computing rare
event probabilities and extreme conditional quantiles. The methodology is
validated by simulations and applied to a data-set of Danish fire insurance
claims.Comment: The paper has been withdrawn by the author due to a major revisio
A spatiotemporal nonparametric Bayesian model of multi-subject fMRI data
In this paper we propose a unified, probabilistically coherent framework for the analysis of task-related brain activity in multi-subject fMRI experiments. This is distinct from two-stage “group analysis” approaches traditionally considered in the fMRI literature, which separate the inference on the individual fMRI time courses from the inference at the population level. In our modeling approach we consider a spatiotemporal linear regression model and specifically account for the between-subjects heterogeneity in neuronal activity via a spatially informed multi-subject nonparametric variable selection prior. For posterior inference, in addition to Markov chain Monte Carlo sampling algorithms, we develop suitable variational Bayes algorithms. We show on simulated data that variational Bayes inference achieves satisfactory results at more reduced computational costs than using MCMC, allowing scalability of our methods. In an application to data collected to assess brain responses to emotional stimuli our method correctly detects activation in visual areas when visual stimuli are presented
Incremental Learning of Nonparametric Bayesian Mixture Models
Clustering is a fundamental task in many vision applications.
To date, most clustering algorithms work in a
batch setting and training examples must be gathered in a
large group before learning can begin. Here we explore
incremental clustering, in which data can arrive continuously.
We present a novel incremental model-based clustering
algorithm based on nonparametric Bayesian methods,
which we call Memory Bounded Variational Dirichlet
Process (MB-VDP). The number of clusters are determined
flexibly by the data and the approach can be used to automatically
discover object categories. The computational requirements
required to produce model updates are bounded
and do not grow with the amount of data processed. The
technique is well suited to very large datasets, and we show
that our approach outperforms existing online alternatives
for learning nonparametric Bayesian mixture models
- …